The first 102 results of the Penny Flipping II problem.


A Problem Solving Diary.


This article will trace the steps involved in a
straightforward computer solution to a non-trivial 
computing problem. The adventure being described is 
specifically in terms of an assembly-language program for 
the 6502 processor, but the language and machine are 
unimportant. What is important is the analysis and 
breakdown of the solution, and its organization. 


Admittedly, not many people are interested in 
assembly language coding (which, on an Apple II, comes 
pretty close to coding in absolute hexadecimal). But 
then, on the face of it, you wouldn't think that there 
would be many people devoted to duplicate bridge (there 
are millions) or beer-can collecting (over a hundred 
thousand) or crossword puzzles. 


Still, it is a fact that there are things to be done 
on the computer for which direct control of the machine is 
necessary. This is in contrast to working, say, in 
BASIC, where there are thousands of instructions between 
the user and the machine, and these instructions do things 
TO the user as well as FOR the user. The writer of those 
thousands of instructions made hundreds of decisions that 
the user must abide by, most of which he doesn't even know 
about. 


Further, only through machine language can the inherent 
speed of the processor be capitalized on. The results 
given in Table W could be obtained in BASIC, perhaps, but 
only after many thousands of hours of execution. 


The problem selected as suitable for this article 
first appeared in our issue number 23 as Penny Flipping II: 


Given a stack of N pennies, initially 
sitting all heads up. Turn over (that 
is, flip) the top penny, then the bottom 2 
pennies, then the top 3 pennies, then the 
bottom 4 pennies,...,and so on until it is 
the entire stack of N that is flipped. 
After every flip, test to determine if the 
stack has returned to all heads. Continue 
with the top 1, the bottom 2, top 3, and so 
on. Count the number of flips to 
return the stack to all heads. 
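

The rules above can be pinned down with a short simulation. What follows is a minimal sketch in Python rather than 6502 assembly language; the names flip and count_flips are our own, and a list of zeros and ones (zero for heads) stands in for the stack, with index 0 taken as the top.

```python
def flip(stack, start, k):
    """Turn over k adjacent pennies as a block, beginning at index start.

    Turning the block over reverses the order of its pennies and
    changes heads (0) to tails (1) and tails to heads.
    """
    block = stack[start:start + k]
    stack[start:start + k] = [coin ^ 1 for coin in reversed(block)]


def count_flips(n):
    """Count the flips needed to bring a stack of n pennies back to all heads."""
    stack = [0] * n                  # index 0 is the top; 0 means heads
    flips = 0
    k = 1
    while True:
        if k % 2 == 1:
            flip(stack, 0, k)        # odd k: the top k pennies
        else:
            flip(stack, n - k, k)    # even k: the bottom k pennies
        flips += 1
        if not any(stack):           # every penny is heads again
            return flips
        k = 1 if k == n else k + 1   # after the whole stack, start over


if __name__ == "__main__":
    print(count_flips(5))            # 20
```

For N = 5 the sketch reports 20 flips.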


Not only does this problem lend itself nicely to 
what we wish to demonstrate, but an enormous amount of 
work has already been done on it and there is a conspicuous 
gap in the known results. Except for N = 58, the value 
of the function (that is, the number of flips to return 
to all heads) is known for all N from 1 to 64. The 
function is highly irregular and hence intriguing. 


Consider: a stack of 98 pennies returns to all heads 
in 9603 flips, but a stack of 104 pennies takes over 18 
million flips. This is irregular indeed. 


The algorithm carried out for N = 5. 
A work area of 5 words is cleared to 
zero (representing all heads). The 
words at the left of the stack of 
five words represent the "top" of 
the stack; the words on the right 
represent the "bottom" of the stack. 
For N = 5, the process returns to 
all heads in 20 steps. 


Digression: One of our goals over the years has 
been to find problem situations for which the computer is 
the essential tool for solution. Many such attempts have 
been thwarted by extremely clever analytic solutions. 
Indeed, it may be that someone could devise a formula for 
the solution of this problem, but its derivation would have 
to depend on a great deal of data like that of Table W. 
I submit that to acquire even a small portion of Table W, 
it is necessary to use a computer; no other tool will do. 


The analysis of the proposed solution is summed up 
in the MAIN flowchart. The heart of the solution is the 
logic of flipping (essentially complementing the contents 
of a set of words and then inverting their order) K coins 
at either the "top" or the "bottom" of a set of N words, 
during the exploration of case N. The MAIN flowchart 
shows the overall (high level) logic of a solution (note 
that we are careful not to say "the" solution), which seems 
to lend itself to a group of sub-problems, listed in 
Figure P, each of which can be coded independently as a 
subroutine. (Subroutine number one was an afterthought; 
it goes on the MAIN flowchart at C.) 


A possible work area, at the high end of storage. 


1. Calculate the left address      This address is needed by 
   of the work area; that is,      several other routines; it 
   address C000 - K.               is efficient to code it 
                                   once as a subroutine. 

2. Clear the work area             For case N, only N words 
   to zeros.                       need be cleared. It is 
                                   obviously easier to write 
                                   a subroutine to clear, 
                                   say, 127 words each time, 
                                   from BF81 through BFFF. 

3. Flip the left K words.          The complete logic for this 
                                   is shown in Flowchart T. 

4. Flip the right K words.         The logic closely parallels 
                                   that of subroutine 3; it is 
                                   not shown explicitly. 

5. Test the work area for          Shown explicitly as 
   all zeros.                      Flowchart R. 

6. Increment a 4-word              Shown explicitly as 
   counter, KK.                    Flowchart Q. 

7. Display N and KK. 

8. Display the right-hand          Used for debugging only. 
   15 words of the work area. 
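

The first two subroutines in the list can be sketched as follows, in Python rather than assembly language. The 64K bytearray standing in for storage and the names left_address and clear_work_area are assumptions made for illustration; only the addresses C000, BF81, and BFFF come from the list above.

```python
WORK_END = 0xC000      # one past the last word of the work area (BFFF)
WORK_WORDS = 127       # BF81 through BFFF


def left_address(count):
    """Subroutine 1: address of the leftmost of count words ending at BFFF."""
    return WORK_END - count


def clear_work_area(memory):
    """Subroutine 2: clear all 127 words, whatever the current N."""
    for address in range(left_address(WORK_WORDS), WORK_END):
        memory[address] = 0


if __name__ == "__main__":
    memory = bytearray([0xFF]) * 0x10000   # hypothetical 64K of storage
    clear_work_area(memory)
    print(hex(left_address(127)))          # 0xbf81
```

Clearing a fixed 127 words keeps the loop bounds constant, at the price of touching words that a small N never uses.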


In addition to segmenting the problem into a set of 
sub-problems (this is the most important use of subroutines) 
it is wise to prepare a map of the storage layout. Figure 
P shows the work area at its top. In addition, other 
words of storage can be allocated, something like this: 


[Storage map showing individual words of storage, among 
them 2001 and 2002, and the four words needed for KK.] 


Portion of a storage map used in preparing an assembly 
language program for the Penny Flipping II problem. 


The heart of the problem calls for counting to 
great heights. In an 8-bit machine like the 6502, it is 
expedient to use four words as a single counter, as shown 
in Flowchart Q. If each of the four words is limited 
to counting to 100 (instead of its natural limit of +127), 
the count is readily converted from hex to decimal. 
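

The idea of a four-word counter with each word limited to 100 can be illustrated with a short sketch. This is written in Python, not taken from Flowchart Q; the names increment, as_hex, and as_decimal are hypothetical, and the counter is assumed to be stored with its high-order word first.

```python
def increment(kk):
    """Add one to a counter held as four words, high-order word first.

    Each word holds 0 through 99, so a word that reaches 100 is reset
    to zero and the carry moves into the next word to the left.
    """
    for i in range(len(kk) - 1, -1, -1):
        kk[i] += 1
        if kk[i] < 100:
            return
        kk[i] = 0
    raise OverflowError("counter passed 99,999,999")


def as_hex(kk):
    """Show the words as the machine would, two hex digits per word."""
    return " ".join(f"{word:02X}" for word in kk)


def as_decimal(kk):
    """Read the words as successive pairs of decimal digits."""
    value = 0
    for word in kk:
        value = value * 100 + word
    return value


if __name__ == "__main__":
    kk = [0, 0, 0, 0]
    for _ in range(295):
        increment(kk)
    print(as_hex(kk), "->", as_decimal(kk))   # 00 00 02 5F -> 295
```

Because no word ever exceeds 99, each hex pair converts to exactly two decimal digits, and the pairs can simply be read off in order.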


Flowchart R shows the basic logic of the test for 
all zeros in the work area. The result of the test is 
communicated back to the MAIN program via a trigger, T. 

T is set to one on entry to the subroutine and remains one 
if any word of the work area is non-zero. If all the 
words of the work area are zero, T is set to zero. 
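

The trigger convention just described can be sketched in a few lines of Python. The function name zero_test is our own, and a list stands in for the work area; Flowchart R itself may order the steps differently.

```python
def zero_test(work_area):
    """Return the trigger T: 0 if every word is zero, otherwise 1."""
    t = 1                      # T is set to one on entry
    for word in work_area:
        if word != 0:
            return t           # a non-zero word was found; T remains one
    t = 0                      # every word was zero
    return t


if __name__ == "__main__":
    print(zero_test([0, 0, 1, 0]))   # 1
    print(zero_test([0, 0, 0, 0]))   # 0
```

The scan stops at the first non-zero word, so the full work area is examined only when the stack really has returned to all heads.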


Subroutines 1 to 7 and the MAIN routine were written 
carefully and loaded into the machine. The signal to 
execute such a program (this one was about 250 instructions) 
is always exciting. Anything can happen, and that can 
include a perfect run right away--but long experience says 
that that eventuality is unlikely. 


In this case, the program did not run properly. It 
ran, to be sure, but produced the result 4 for every value 
of N. The program was written to beep at the time of 
displaying each result, so the first trial run produced 
rapid beeps and endless 4's. 


At this point, the breakdown of the solution into 
clearly defined subroutines really pays off. Even though 
the total program is clearly not working, the individual 
parts can be checked. Clearly, subroutine 5 (the test 
for all zeros in the work area) is already working fine, 
as is subroutine 7 (the display of results). It is easy 
to determine that subroutines 1, 2, 3, and 4 are already 
each doing their assigned task properly, but somehow the 
interactions are failing. 


So subroutine 8 was added to the program, to display 
the high-order words of the work area. This subroutine 
was called at the places marked A and B on the MAIN 
flowchart. When the program was run again, the output from 
subroutine 8, for N = 10, showed: 


[Display: four lines of zeros and ones showing the work 
area, with a bar drawn over the words in each line that 
should have been flipped.] 

and this pattern was repeated for each new value of N. 


The bars over each line in the pattern indicate what 
should have been flipped. Actually, the flipping pattern 
seems to be top 1, bottom 2, top 1, bottom 2. The whole 
matter would be explained if the comparison shown at E in 
the MAIN flowchart was frozen on "greater than." Some 
study of the code revealed that that was exactly what was 
taking place; the comparison had been omitted entirely, 
but the branches based on the comparison were there. 


Corrections and changes in machine language are 
usually easy to make by an old technique called "out to 
the woods and back." At the place in the program where 
a patch is to be inserted, a branch is overlaid, replacing 
an old instruction; this branch is to an unused portion 
of storage, preferably at the end of the routine involved. 
Then, out in this "woods" area, the stepped-on instruction 
is replaced, together with the necessary patching group 
of instructions, followed by a branch back to the 
instruction after the patch. It is all quite primitive, 
nostalgic, and thoroughly satisfying. It is good practice 
to label the "woods" end of the patch with a note as to 
where it came from. 


All patches should be made on the coding sheets in a 
new color. The rule is: when any routine reaches four 
levels of Technicolor, it is time to scrap it and start 
over. 


In our case, the missing COMPARE instruction was 
readily patched in, and another run was initiated, with 
N = 10. This produced immediate success: 


0A 00 00 02 5F 


The results were displayed in hexadecimal, of course, since 
that is the easy way to go, at least for a first try. The 
output translates quickly into: 


10 295 


which is correct (that is, it agrees with previously 
published results). 


Subsequent answers were not so pleasing. An 
examination of the next 20 results (which took less than a 
minute to produce) showed that the answers for even values 
of N were all correct, but the answers for odd values of N 
were wildly wrong. Back to the MAIN flowchart: what's 
different about odd and even stacks of coins? It turns 
out to be this: an even stack usually ends at E, while an 
odd stack usually ends at D. The logic at D was wrong; 
it had been written to go to Reference 4, and it should 
go to Reference 3. 


So yet another patch was made, and the program then 
flew correctly. The results shown in Table W beyond 
N = 64 (previously published) were simply a matter of CPU 
time. The time to obtain each result should be 
proportional to the product of N and the number of flips. 
An actual production run showed these numbers: 


[Table with columns headed "Time in Seconds" and "N times F 
divided by time." The only values legible are 1034280, 
54648, 67536, 138720, 4346, 260680, and 3419888.] 


The logic of subroutine T shows one obvious shortcut 
at the place marked (*). The action of turning over 
a coin is being simulated by a single bit in a word of 
storage changing from zero to one or from one to zero. 
This is precisely the action of complementing, and is one 
of the chief uses of the exclusive OR operation, which is 
defined to be: 


0 EOR 0 = 0 
0 EOR 1 = 1 
1 EOR 0 = 1 
1 EOR 1 = 0 


Thus, if the number 00000001 is exclusive OR'd with a word 
in storage, the low-order bit of the other word will change 
from zero to one or from one to zero. 


The flowchart for subroutine T shows the complementing 
action being done the long way. This poses a problem; 
namely, how do the two approaches compare? The coding 
compares like this: 


As pictured in                  Using the 
flowchart T                     EOR command 

Load word X                     Load a one 
Branch on equal (zero) to       EOR with word X 
Load a zero                     Store at X 
Store at X                      (Continue) 
Branch to 
Load a one 
Store at X 
(Continue) 


The coding on the left is simpler to follow; I would 
suggest such an approach during the debugging and testing 
phases of a new problem. Then, if it seems to pay, the 
shorter coding can be substituted; the program can then be 
re-tested, and the shortcut will then be effective. 
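

The two codings can be set side by side in a short sketch to confirm that they do the same thing. This is Python, not 6502 code, and the function names are hypothetical; each word is assumed to hold only zero (heads) or one (tails), as in the work area.

```python
def turn_over_long_way(word):
    """Complement a 0/1 word by testing and storing, as on the flowchart."""
    if word == 0:
        word = 1
    else:
        word = 0
    return word


def turn_over_with_eor(word):
    """Complement the low-order bit with a single exclusive OR."""
    return word ^ 1


if __name__ == "__main__":
    for word in (0, 1):
        print(word, turn_over_long_way(word), turn_over_with_eor(word))
```

The two agree only because the words never hold anything but zero or one; for any other value the long way would store a zero, while the exclusive OR would change the low-order bit alone.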


In this case, the time difference between the two modes 
of attack amounted to 5.8%. If the simple-minded approach 
shown on the flowchart for subroutine T is the one first used, 
then the programmer must decide whether or not it is worth it 
to recode and retest the program for a 5.8% gain in speed. 

It is axiomatic in the computing business that it is only 
after a program is tested and in production that its writer 
really understands how it should have been written. Thus, 
one's first attempt at a new program really should be just that: 
a trial program, made to be discarded. Looking at it slightly 
differently, a working program is an open invitation to write 
it again, only now we know how to do it right. The emergence 
of personal computers has made it possible for many people to 
enjoy this exquisite luxury. 


When this program is rewritten, the following 
improvements should be considered: 


1. Subroutines 3 and 4 (the action of performing 
flips of K coins on the top and bottom of a stack) should 
be each written as one continuous loop, instead of in 
pieces. As shown on Flowchart T, there are three 
distinct actions; namely, turning over the coins, copying 
the K words to another area of storage, and then copying 
them back in reverse order. This involved approach made 
the coding easier, but at the cost of inefficiency. 


2. If some of the subroutines were written as open 
subroutines (that is, not linked to), some time could be 
saved. This improvement, however, is apt to be at the 
(1/10)% level. 


3. Only N words of the work area need be cleared 
for each new value of N. 
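

Improvement 1 can be illustrated by comparing the flip done in pieces with the flip done in one continuous loop. Both versions below are sketches in Python under our own names, not transcriptions of Flowchart T; words is a list of zeros and ones, and start is the index of the leftmost of the k words to be flipped.

```python
def flip_in_pieces(words, start, k):
    """Three distinct actions: turn over, copy out, copy back reversed."""
    for i in range(start, start + k):      # turn over the coins
        words[i] ^= 1
    scratch = words[start:start + k]       # copy the k words elsewhere
    for i in range(k):                     # copy them back in reverse order
        words[start + i] = scratch[k - 1 - i]


def flip_in_one_loop(words, start, k):
    """One pass: swap the two ends, complementing each word as it moves."""
    left, right = start, start + k - 1
    while left < right:
        words[left], words[right] = words[right] ^ 1, words[left] ^ 1
        left += 1
        right -= 1
    if left == right:                      # odd k: the middle coin stays put
        words[left] ^= 1


if __name__ == "__main__":
    words = [1, 0, 0, 1, 1]
    flip_in_one_loop(words, 0, 3)
    print(words)                           # [1, 1, 0, 1, 1]
```

The single loop needs no second area of storage and touches each word once, which is where the saving would come from.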


The functional values for the Penny Flipping II 
problem are now known through N = 112 and there is evidence 
that for N = 113, the result is greater than 28,000,000. 
The results are of no great intrinsic value (although the 
function, graphed in Figure K, is mysterious and hence 
intriguing), but the methods of producing correct programs 
are always worth exploring. 


Rather than rewrite the program for the Penny Flipping 
II Problem, it was more attractive to write a new program 
for the Penny Flipping IV Problem: 


Given a stack of N pennies, initially all 
sitting heads up. Flip the top penny, 
then the entire stack, then the top 2, then 
the entire stack,...,until the top K 
pennies are the entire stack; then start 
over with the top 1, the entire stack, 
the top 2,...,and so on. Count the 
number of flips to return the stack to 
all heads. 


The first running of this program produced results 
(and this fact establishes that most of the subroutines 
are working properly), but all the results were, again, 
wildly wrong (as compared to known results published in our 
issues 25 and 29). The MAIN flowchart (Figure H), it 
turns out, is illogical. It is left to the reader to 
spot the logical error. (Try N = 8, for example, and 
trace through the successive values of K that should satisfy 
the conditions of the problem.) 


The cubes shown on these two pages illustrate graphically 
the famous statement of Srinivasa Ramanujan (quoted by 
G. H. Hardy): 


... it (1729) is a very interesting number; it is the 
smallest number expressible as a sum of two cubes in two 
different ways. 


Thus, we have: 


10^3 + 9^3 = 12^3 + 1^3 


No smaller number ... 


It strikes me that this is as close as we can come to 
absolute truth; Ramanujan's statement requires no axioms 
or postulates and an absolute minimum of assumptions. 